Algorithms for Generalized Cluster-wise Linear Regression

نویسندگان

  • Young Woong Park
  • Yan Jiang
  • Diego Klabjan
  • Loren Williams
چکیده

Cluster-wise linear regression (CLR), a clustering problem intertwined with regression, is to find clusters of entities such that the overall sum of squared errors from regressions performed over these clusters is minimized, where each cluster may have different variances. We generalize the CLR problem by allowing each entity to have more than one observation, and refer to it as generalized CLR. We propose an exact mathematical programming based approach relying on column generation, a column generation based heuristic algorithm that clusters predefined groups of entities, a metaheuristic genetic algorithm with adapted Lloyd’s algorithm for K-means clustering, a two-stage approach, and a modified algorithm of Späth [26] for solving generalized CLR. We examine the performance of our algorithms on a stock keeping unit (SKU) clustering problem employed in forecasting halo and cannibalization effects in promotions using real-world retail data from a large supermarket chain. In the SKU clustering problem, the retailer needs to cluster SKUs based on their seasonal effects in response to promotions. The seasonal effects are the results of regressions with predictors being promotion mechanisms and seasonal dummies performed over clusters generated. We compare the performance of all proposed algorithms for the SKU problem with real-world and synthetic data.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On cluster-wise fuzzy regression analysis

Since Tanaka et al. (1982) proposed a study of linear regression analysis with a fuzzy model, fuzzy regression analysis has been widely studied and applied in a variety of substantive areas. Regression analysis in the case of heterogeneity of observations is commonly presented in practice. The authors' main goal is to apply fuzzy clustering techniques to fuzzy regression analysis. Fuzzy cluster...

متن کامل

Penalized Bregman Divergence Estimation via Coordinate Descent

Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron, et al. (2004) introduced the LARS algorithm. Recently, the coordinate descent (CD) algorithm was developed by Friedman, et al. (2007) for penalized linear regression and penalized logistic regression and was shown to gain computational superiority. This paper explores...

متن کامل

Training L1-Regularized Models with Orthant-Wise Passive Descent Algorithms

The `1-regularized sparse model has been popular in machine learning society. The orthant-wise quasi-Newton (OWL-QN) method is a representative fast algorithm for training the model. However, the proof of the convergence has been pointed out to be incorrect by multiple sources, and up until now, its convergence has not been proved at all. In this paper, we propose a stochastic OWL-QN method for...

متن کامل

New Heuristic Algorithms for Solving Single-Vehicle and Multi-Vehicle Generalized Traveling Salesman Problems (GTSP)

Among numerous NP-hard problems, the Traveling Salesman Problem (TSP) has been one of the most explored, yet unknown one. Even a minor modification changes the problem’s status, calling for a different solution. The Generalized Traveling Salesman Problem (GTSP)expands the TSP to a much more complicated form, replacing single nodes with a group or cluster of nodes, where the objective is to fi...

متن کامل

Relevance vector machine and multivariate adaptive regression spline for modelling ultimate capacity of pile foundation

This study examines the capability of the Relevance Vector Machine (RVM) and Multivariate Adaptive Regression Spline (MARS) for prediction of ultimate capacity of driven piles and drilled shafts. RVM is a sparse method for training generalized linear models, while MARS technique is basically an adaptive piece-wise regression approach. In this paper, pile capacity prediction models are developed...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1607.01417  شماره 

صفحات  -

تاریخ انتشار 2016